Add regex_program strings splitting java APIs and tests #12713
Signed-off-by: Cindy Jiang <[email protected]>
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## branch-23.04 #12713 +/- ##
===============================================
Coverage ? 85.81%
===============================================
Files ? 158
Lines ? 25153
Branches ? 0
===============================================
Hits ? 21586
Misses ? 3567
Partials ? 0
/**
 * Returns a list of columns by splitting each string using the specified regex program pattern.
 * The number of rows in the output columns will be the same as the input column. Null entries
 * are added for a row where split results have been exhausted. Null input entries result in
Suggested change:
-  * are added for a row where split results have been exhausted. Null input entries result in
+  * are added for the rows where split results have been exhausted. Null input entries result in
/**
 * Returns a list of columns by splitting each string using the specified regex program pattern.
 * The number of rows in the output columns will be the same as the input column. Null entries
 * are added for a row where split results have been exhausted. Null input entries result in
Suggested change:
-  * are added for a row where split results have been exhausted. Null input entries result in
+  * are added for the rows where split results have been exhausted. Null input entries result in
 * corresponding rows of the output columns.
 * Returns a list of columns by splitting each string using the specified string literal
 * delimiter. The number of rows in the output columns will be the same as the input column.
 * Null entries are added for a row where split results have been exhausted. Null input entries
Suggested change:
-  * Null entries are added for a row where split results have been exhausted. Null input entries
+  * Null entries are added for the rows where split results have been exhausted. Null input entries
/**
 * Returns a list of columns by splitting each string using the specified regular expression
 * pattern. The number of rows in the output columns will be the same as the input column.
 * Null entries are added for a row where split results have been exhausted. Null input entries
Suggested change:
-  * Null entries are added for a row where split results have been exhausted. Null input entries
+  * Null entries are added for the rows where split results have been exhausted. Null input entries
auto const column_view = reinterpret_cast<cudf::column_view const *>(input_handle);
auto const strings_column = cudf::strings_column_view{*column_view};
Suggest avoiding column_view as a variable name, as it may clash with cudf::column_view.
Suggested change:
- auto const column_view = reinterpret_cast<cudf::column_view const *>(input_handle);
- auto const strings_column = cudf::strings_column_view{*column_view};
+ auto const input = reinterpret_cast<cudf::column_view const *>(input_handle);
+ auto const strings_column = cudf::strings_column_view{*input};
@@ -735,22 +754,43 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnView_stringSplitRecord(JNIEnv

try {
  cudf::jni::auto_set_device(env);
  auto const input = reinterpret_cast<cudf::column_view *>(input_handle);
  auto const strs_input = cudf::strings_column_view{*input};
  auto const column_view = reinterpret_cast<cudf::column_view const *>(input_handle);
Similar to above.
try {
  cudf::jni::auto_set_device(env);
  auto const column_view = reinterpret_cast<cudf::column_view const *>(input_handle);
Similar to above.
Thank you! All changes updated.
/merge
Description
This PR adds regex_program Java APIs and unit tests for split_re, rsplit_re, split_record_re, and rsplit_record_re. Part of the work for NVIDIA/spark-rapids#7295.
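The null-padding behavior described in the Javadoc under review (one output column per split position, nulls where a row's split results are exhausted, and nulls for null input rows) can be illustrated with a plain-Java sketch. This does not call cudf: `SplitSketch` and `splitToColumns` are hypothetical names, and it uses `java.util.regex`, whose matching rules may differ from cudf's regex engine in edge cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the cudf Java API.
class SplitSketch {
    // Returns one list per output column: result.get(c).get(r) is the c-th
    // token of input row r, or null if that row has fewer than c + 1 tokens
    // or the input row itself is null.
    public static List<List<String>> splitToColumns(List<String> rows, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String[]> tokens = new ArrayList<>();
        int width = 0;
        for (String row : rows) {
            // A limit of -1 keeps trailing empty strings instead of dropping them.
            String[] parts = (row == null) ? null : pattern.split(row, -1);
            tokens.add(parts);
            if (parts != null) {
                width = Math.max(width, parts.length);
            }
        }
        List<List<String>> columns = new ArrayList<>();
        for (int c = 0; c < width; c++) {
            List<String> column = new ArrayList<>();
            for (String[] parts : tokens) {
                // Pad with null once a row's split results are exhausted.
                column.add(parts != null && c < parts.length ? parts[c] : null);
            }
            columns.add(column);
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a_b_c", "d_e", null);
        // prints [[a, d, null], [b, e, null], [c, null, null]]
        System.out.println(splitToColumns(input, "_"));
    }
}
```

Every output column has the same number of rows as the input, which is the invariant the Javadoc states; the record variants (split_record_re, rsplit_record_re) would instead return one list of tokens per row.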
Checklist